techniques for solving hard combinatorial optimization problems with
random inputs. We describe a randomized algorithm for efficiently
constructing an independent set of fixed size in an instance of a
random independence system. We provide a general method of analysis
of the performance of this algorithm, which allows us to derive bounds
on the mean, variance and all the moments of the time complexity of
the algorithm.




1.  Introduction 


In a classic paper of 1935, "On the Abstract Properties of Linear
Dependence", Whitney provided a set of axioms for a structure called
here a Whitney matroid. Matroid theory (see [Tutte, 1971], [Lawler,
1976]) has applications to a wide class of combinatorial optimization
problems in which we wish to construct a maximal object (a maximum
independent set) satisfying a monotone property.

We introduce in this paper (Section 2) the random independence
system (RIS), which is applicable to a more general class of combinatorial
optimization problems with random inputs. We define some natural notions,
such as "maximal with a given probability." Properties of random
independence systems such as the existence of an independent set of given
cardinality (with probability 1), the relationship between RIS and Whitney
matroids and properties of intersections of RIS are discussed in a
companion paper [Reif, Spirakis, 81] (see also Section 2 of a previous
draft of this paper, [Reif, Spirakis, 80]). In that paper we describe a
nonconstructive proof technique for determining (with probability 1) the
existence of an independent set of given cardinality or given weight in an
instance of a random independence system.

In this paper, we develop a randomized algorithm, the
extension-rotation (E-R) algorithm, for efficiently constructing an
independent set of a given size $h_0$ in an instance of an RIS. Given an
independent set $I$ of size less than $h_0$, we attempt to extend $I$ (by
adding a new random element $e$ to $I$) or else attempt to rotate $I$ (by
deleting an element $e'$ of $I$ and adding the new element $e$). The use
of a rotation operation first appeared in Posa's [1976] existence proof
for a Hamiltonian path in an undirected random graph of density
$O(\log(n)/n)$. [Karp, 1976] and [Angluin and Valiant, 1979] consider
random algorithms with extensions and rotations.

The introduction of the rotation operation seems necessary for
certain independence systems, since the greedy algorithm (which utilizes
only extensions) may have arbitrarily bad performance (see [Korte,
Hausmann, 78]). We show that the probability density of the number of
rotation steps between successive extensions is upper and lower bounded
by geometric density functions. From these bounds we derive sufficient
conditions (a lower bound on the element density) for the E-R algorithm
to succeed with arbitrarily high probability. Also, we can derive bounds
on the probability density function of the total number of steps, and from
these density functions derive bounds on the mean, variance and all the
moments of the time complexity of the algorithm. Thus we have a general
method for analysis of the performance of the random extension-rotation
algorithm. We view this as the most significant contribution of the paper.

We also give some applications to random graphs $G_{n,p}$ (see
Section 2.3 and [Erdos and Spencer, 1974]).

P1   Construct a Hamiltonian path in $G_{n,p}$.

P1'  For a graph $H$ of fixed size, construct a subgraph of $G_{n,p}$
     homeomorphic to $H$.

P2   Construct a perfect matching in $G_{n,p}$.

P2'  Construct a perfect matching in a random bipartite graph $B_{n,p}$.

(Note that P1' is a generalization of P1.)

The randomized E-R algorithm is applicable to P1, P2 and P2' (and
we have an efficient transformation from instances of P1' to instances of
P1).

The results of our general method for analysis of the
extension-rotation algorithm yield lower bounds for the edge density $p$
to give probability of success $1 - n^{-\alpha}$ for $\alpha > 1$.
Previously [Erdos and Renyi, 1959] have considered P1 and [Posa, 1976]
considers P1 for undirected graphs. [Angluin and Valiant, 1979] consider
P1 and P2 for directed graphs. They derive similar results for a different
random graph model $G_{n,N}$ and their results hold for $G_{n,p}$ only
under certain conditions as $n \to \infty$.

Our  general  method  also  yields  significant  new  results  for  these 
applications,  such  as  tight  bounds  (within  a  constant  multiple)  on  the 
mean  and  variance  of  the  randomized  algorithm’s  time  complexity. 


2.  Definitions  of  Random  Independence  Systems  and  Their  Structure 

2.1  Definitions  of  Random  Independence  Systems 

Let $E$ be a set and let $\mathcal{I}$ be a family of subsets of $E$.
Let $p$ be a real number (the element density) on the interval $[0,1]$.
The triple $M = (E, \mathcal{I}, p)$ is a (uniform) random independence
system (RIS). (Nonuniform random independence systems and weighted random
independence systems are defined in [Reif, Spirakis, 1981].) We will
frequently write $(E, \mathcal{I}, 1)$ as $(E, \mathcal{I})$.
$M = (E, \mathcal{I}, p)$ is a proper random independence system if

A1  $\emptyset \in \mathcal{I}$.

A2  If $I \in \mathcal{I}$ and $I' \subseteq I$, then $I' \in \mathcal{I}$.

Intuitively, $\mathcal{I}$ may be considered a property on subsets of $E$
which is trivially satisfied (by axiom A1) and monotone decreasing (by
axiom A2). $(E, \mathcal{I})$ is a Whitney matroid (a matroid as defined
by [Whitney, 1935]) if it satisfies A1, A2 and the additional axiom

A3  For any sets $A, A' \in \mathcal{I}$ of cardinality $h$, $h+1$
    respectively, $\exists e \in A' - A$ such that
    $A \cup \{e\} \in \mathcal{I}$.

2.2  Instances of Random Independence Systems

An instance of a random independence system $M = (E, \mathcal{I}, p)$ is
a pair $M_0 = (E_0, \mathcal{I}_0)$ where

(i)   $E_0 \subseteq E$ is derived by independently choosing each
      $e \in E$ with probability $p$.

(ii)  $\mathcal{I}_0 = \{I \in \mathcal{I} \mid I \subseteq E_0\}$.

Note that the probability of $M_0$ is $p^{|E_0|}(1-p)^{|E|-|E_0|}$.
Clearly, any instance $M_0$ of a proper RIS satisfies axioms A1 and A2.
(Hence, any instance of a proper RIS is an independence system, as defined
in [Korte and Hausmann, 1978].)
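
The sampling step in (i) can be illustrated with a short sketch. This is only an illustration under our own assumptions: the function names `sample_instance` and `instance_probability` are hypothetical, elements of $E$ are taken to be hashable Python values, and the independence family is left implicit (it would be supplied separately as a membership test).

```python
import random

def sample_instance(E, p, rng):
    """Return E_0: each element of E is kept independently with probability p."""
    return {e for e in E if rng.random() < p}

def instance_probability(E, E0, p):
    """Probability p^|E_0| (1-p)^(|E|-|E_0|) of drawing exactly the subset E0."""
    return p ** len(E0) * (1 - p) ** (len(E) - len(E0))

if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, for repeatability only
    E = list(range(10))
    E0 = sample_instance(E, 0.5, rng)
    print(sorted(E0), instance_probability(E, E0, 0.5))
```

Summing `instance_probability` over all subsets of a small ground set gives 1, which is a quick sanity check on the formula.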

A set $A \subseteq E_0$ is independent in $M_0$ if $A \in \mathcal{I}_0$
and dependent otherwise. An independent set $I \in \mathcal{I}_0$ is
maximum in $M_0$ if there does not exist an $I' \in \mathcal{I}_0$ such
that $|I'| > |I|$. Let the rank of $M_0$ be the cardinality of a maximum
independent set. $I \in \mathcal{I}_0$ is maximal in $M_0$ if there does
not exist an $I' \in \mathcal{I}_0$ such that $I' \supsetneq I$. A minimal
dependent set of $M_0$ (a circuit) has no proper subset which is dependent
in $M_0$. For any $A \subseteq E_0$ let the rank of $A$ in $M_0$ be the
maximum cardinality of any independent subset of $A$. It follows from a
result of [Korte, Hausmann, 1978] that for any instance $M_0$ of a proper
RIS there exists an integer $k$ and $k$ matroids of which the instance is
an intersection.


2.3  Examples of Random Independence Systems

As an example of an RIS, let $Q$ be a property on graphs and let
$G_{n,p}$ be a random undirected graph. (Examples can be given for
directed graphs also. For sake of simplicity here we give only examples
for random undirected graphs.) The random graph $G_{n,p}$ is a random
variable whose instances are graphs with vertices $V = \{1,2,\ldots,n\}$
and each edge chosen independently with probability $p$ from the set
$E = \{\{u,v\} \mid u,v$ are distinct vertices of $V\}$. Let
$M = (E, \mathcal{I}, p)$ be the (uniform) RIS with $E$ as above and
$\mathcal{I} = \{I \subseteq E \mid Q(V,I)$ holds$\}$. Then any instance
$M_0 = (E_0, \mathcal{I}_0)$ of $M$ corresponds to an instance $(V, E_0)$
of the random graph $G_{n,p}$ and $\mathcal{I}_0$ contains precisely those
edge sets $I \subseteq E_0$ such that the property $Q$ holds for the
subgraph $(V,I)$. Note that $M$ is a proper RIS if the graph property $Q$
is

(1)  trivially satisfied, i.e., $Q$ holds for the graph with no
     edges, and

(2)  decreasing monotone: $Q(G) \Rightarrow Q(G')$ for all subgraphs
     $G'$ of $G$.

We list some graph properties and the corresponding RIS below.

P1  (Hamiltonian paths)

Given a graph $G = (V,E)$, a simple path is a path of edges in $E$
containing no cycles, and it is a Hamiltonian path if it contains every
vertex of $V$. The property of a "simple path" in a random graph does not
yield a proper RIS, since a simple path must be connected (violating
axiom A2). However, we can define a proper RIS such that any independent
set of cardinality $|V| - 1$ is a Hamiltonian path. We give both
formulations here:

Formulation as a non-proper RIS: Let $M = (E, \mathcal{I}, p)$ be the RIS
where $\mathcal{I}$ is the set of all simple paths in the complete graph
$(V,E)$. Fix an instance $M_0 = (E_0, \mathcal{I}_0)$ of $M$. Then
$(V,E_0)$ has the same probability in the random graph $G_{n,p}$ as in $M$
and $\mathcal{I}_0$ is the set of all simple paths in $(V,E_0)$.

Formulation as a proper RIS: Let $M = (E, \mathcal{I}, p)$ be the RIS with
$E$ as above and $\mathcal{I} = \{I \subseteq E \mid (V,I)$ consists of a
set of disjoint simple paths$\}$. Clearly $M$ satisfies axioms A1, A2.
Fix an instance $M_0 = (E_0, \mathcal{I}_0)$ of $M$. Then $(V,E_0)$ has
the same probability in $G_{n,p}$ as in $M$ and $\mathcal{I}_0$ has as
elements all different sets of disjoint simple paths in $E_0$.

In both formulations, if $M_0$ has an independent set $I$ such that
$|I| = n - 1$ then $(V,I)$ is a Hamiltonian line in $(V,E_0)$.
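
As a sketch of how membership in the proper formulation might be tested in practice, the following checks that an edge set forms vertex-disjoint simple paths: every vertex has degree at most 2 and no cycle is closed. The names `is_path_forest` and `is_hamiltonian_path` are our own, and we assume vertices are numbered 0 to n-1 and the check uses a union-find structure, which the paper does not prescribe.

```python
def is_path_forest(n, edges):
    """True if (V, edges), V = {0..n-1}, is a set of vertex-disjoint simple paths."""
    degree = [0] * n
    parent = list(range(n))          # union-find forest over vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if degree[u] > 2 or degree[v] > 2:
            return False             # a path vertex has degree at most 2
        ru, rv = find(u), find(v)
        if ru == rv:
            return False             # this edge would close a cycle
        parent[ru] = rv
    return True

def is_hamiltonian_path(n, edges):
    """An independent set with n-1 edges is a single path through all vertices."""
    return len(edges) == n - 1 and is_path_forest(n, edges)
```

The empty edge set passes (axiom A1), and deleting edges from a path forest leaves a path forest (axiom A2), matching the claim that this formulation is proper.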

P2  Perfect matchings

An edge matching of a graph is a set of vertex disjoint edges, and
is perfect if every vertex appears in some edge of the matching. To
formulate the "perfect matching" problem as an RIS, we assume a complete
graph $G = (V,E)$ with $2n$ vertices. Let $M = (E, \mathcal{I}, p)$ where
$\mathcal{I} = \{I \subseteq E \mid I$ is a matching$\}$.

Let $M_0 = (E_0, \mathcal{I}_0)$ be an instance of $M$. Then $M_0$ has a
perfect matching if there is an $I \in \mathcal{I}_0$ such that $|I| = n$.
The property of "matching" in a random graph yields a proper RIS, since if
$I$ is a matching then every $I' \subseteq I$ is a matching.


P2'  Bipartite matching

In the following let $V_1 = \{1,\ldots,n\}$, $V_2 = \{n+1,\ldots,2n\}$ be
disjoint vertex sets of equal cardinality, and let
$E = \{\{u,v\} \mid u \in V_1, v \in V_2\}$.

A bipartite graph $B = (V_1 \cup V_2, E_0)$ has vertex set $V_1 \cup V_2$
and edge set $E_0 \subseteq E$. $B$ is complete if $E_0 = E$. A random
bipartite graph $B_{n,p}$ has instances which are bipartite graphs
$(V_1 \cup V_2, E_0)$ where each edge of $E_0$ is chosen from $E$ with
probability $p$.

An (edge) matching of a bipartite graph $(V_1 \cup V_2, E_0)$ is a set of
vertex disjoint edges $I \subseteq E_0$ and is perfect if every vertex of
$V_1 \cup V_2$ appears in some edge of $I$. The bipartite perfect matching
problem is formulated as a proper RIS by assuming a complete bipartite
graph $B = (V_1 \cup V_2, E)$. Let $M = (E, \mathcal{I}, p)$ where
$\mathcal{I} = \{I \subseteq E \mid I$ is a (bipartite) matching$\}$. Let
$M_0$ be an instance of $M$. $M_0$ has a perfect matching if there is an
$I \in \mathcal{I}_0$ such that $|I| = n$.

3.  The E-R Algorithm for Constructing Independent Sets

In this section we describe an efficient algorithm for constructing
an independent set of fixed size from an instance of a random independence
system. This E-R algorithm is a generalization of random graph algorithms
which have appeared in [Posa, 1976], [Karp, 1976], and [Angluin and
Valiant, 1979]. In Section 5 we develop a general method of analysis of
the E-R algorithm which provides:

(i)   Sufficient conditions for successful termination with probability
      $1 - |E|^{-\alpha_0}$ for any fixed sufficiently large
      $\alpha_0 > 1$.

(ii)  Upper and lower bounds on the probability density of the time
      cost of the E-R algorithm, from which the mean, variance and
      all the moments of the time cost may be derived.


Section  4  gives  a  simplified  discussion  of  that  analysis,  which  is 
intended  to  aid  the  reader's  intuition  and  lead  to  the  more  thorough 
analysis  of  Section  5. 

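Let $M_0 = (E_0, \mathcal{I}_0)$ be an instance of the random independence
system $M = (E, \mathcal{I}, p)$. We wish to construct an independent set
of size $h_0 > 0$.

For any independent set $I \in \mathcal{I}_0$, let
$\mathcal{E}(I) = \{e \in E_0 - I \mid I \cup \{e\} \in \mathcal{I}_0\}$.
Note that if $\mathcal{E}(I) \ne \emptyset$ then we may extend $I$ by
choosing an $e \in \mathcal{E}(I)$ and substituting $I \cup \{e\}$ for $I$.

Also, for any independent set $I \in \mathcal{I}_0$, let
$\mathcal{R}(I) = \{e \in E_0 \mid I \cup \{e\} \notin \mathcal{I}_0$ but
$\exists e' \in I$ with $I \cup \{e\} - \{e'\} \in \mathcal{I}_0\}$. If
$\mathcal{R}(I) \ne \emptyset$, we may rotate $I$ by choosing an
$e \in \mathcal{R}(I)$ and some appropriate $e' \in I$ and substituting
$I \cup \{e\} - \{e'\}$ for $I$.

Actually, in the algorithm below, we choose a random element
$e \in \mathcal{E}(I) \cup \mathcal{R}(I)$ and first attempt to extend $I$
by $e$, and else rotate $I$ by $e$. We call $\mathcal{E}(I)$ the extension
set of $I$ and $\mathcal{R}(I)$ the rotation set of $I$.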

3.1  The E-R Algorithm

INPUT: An instance $M_0 = (E_0, \mathcal{I}_0)$ of a random independence
system $M = (E, \mathcal{I}, p)$ and integer $h_0 > 0$.

INITIALIZATION: $I \leftarrow \emptyset$; $T \leftarrow 0$

WHILE $|I| < h_0$ DO
BEGIN
    IF $\mathcal{E}_T(I) \cup \mathcal{R}_T(I) = \emptyset$ THEN FAIL
    choose some random $e \in \mathcal{E}_T(I) \cup \mathcal{R}_T(I)$
    IF $e \in \mathcal{E}_T(I)$ THEN EXTEND: $I \leftarrow I \cup \{e\}$
    ELSE BEGIN
        choose $e' \in I$ with $(I \cup \{e\}) - \{e'\} \in \mathcal{I}_0$
        ROTATE: $I \leftarrow I \cup \{e\} - \{e'\}$
    END
    $T \leftarrow T + 1$
    $E_T \leftarrow E_{T-1} - \{e\}$
END
RETURN ($I$).

We define the sets:

$\mathcal{E}_T(I) = \{e \in E_T \mid I \cup \{e\} \in \mathcal{I}_0\}$,

$\mathcal{R}_T(I) = \{e \in E_T \mid I \cup \{e\} \notin \mathcal{I}_0$,
but $\exists e' \in I$ with $I \cup \{e\} - \{e'\} \in \mathcal{I}_0\}$

as "macros" which are expanded in-line within the E-R algorithm.
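
The loop above can be sketched in Python as follows. This is a minimal illustration, not the implementation analyzed in the paper: the function name `extension_rotation`, the `is_independent` oracle argument, and the brute-force recomputation of the extension and rotation sets on every step are all our own assumptions, chosen for clarity rather than efficiency.

```python
import random

def extension_rotation(E0, is_independent, h0, rng):
    """Sketch of the E-R loop. Returns (I, T) on success, (None, T) on FAIL.

    E0             -- iterable of (sortable, hashable) elements of the instance
    is_independent -- hypothetical oracle: set of elements -> bool
    h0             -- target size of the independent set
    rng            -- a random.Random instance
    """
    remaining = set(E0)      # plays the role of E_T
    I = set()
    T = 0
    while len(I) < h0:
        ext = []             # E_T(I): elements that extend I
        rot = {}             # R_T(I): element -> one valid e' to delete
        for e in sorted(remaining):
            if is_independent(I | {e}):
                ext.append(e)
            else:
                for e_out in sorted(I):
                    if is_independent((I | {e}) - {e_out}):
                        rot[e] = e_out
                        break
        candidates = ext + list(rot)
        if not candidates:
            return None, T   # FAIL: both sets are empty
        e = rng.choice(candidates)
        if e in rot:
            I.remove(rot[e]) # ROTATE: delete e' before adding e
        I.add(e)             # EXTEND, or second half of ROTATE
        remaining.discard(e) # E_T <- E_{T-1} - {e}
        T += 1
    return I, T
```

Each iteration removes one element from `remaining`, so the loop runs at most $|E_0|$ times, in line with the time index $T$ used in the analysis.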


For the problem of perfect matchings in random graphs $G_{n,p}$ the
extension and rotation sets are defined as follows: Let
$M_0 = (E_0, \mathcal{I}_0)$ be an instance of the matching RIS and
$I \in \mathcal{I}_0$. Then

$\mathcal{E}(I) = \{e \in E - I \mid$ the vertices of $e$ are distinct
from the vertices of $I\}$

and

$\mathcal{R}(I) = \{e \in E - I \mid$ exactly one vertex of $e$ is a
vertex of $I\}$.

For the bipartite perfect matching problem in bipartite random
graphs $(V_1, V_2, p)$ with $|V_1| = |V_2| = n$, the extension and
rotation sets are defined as follows: Let $M_0 = (E_0, \mathcal{I}_0)$ be
an instance of the bipartite matching RIS and let $I \in \mathcal{I}_0$.
Let $V_i(I)$ = set of vertices in $V_i$ which are incident to edges in
$I$, for $i = 1,2$. Fix a $u \in V_1 - V_1(I)$. Then

$\mathcal{E}_T(I) = \{\{u,v\} \in E_0 \mid v \notin V_2(I)\}$
$\mathcal{R}_T(I) = \{\{u,v\} \in E_0 \mid v \in V_2(I)\}$.

In case of an edge $e$ selected from $\mathcal{R}_T(I)$, the rotation is
done as follows: Let $e' = \{u',v\}$ be the (unique) edge of $I$ such that
$e'$, $e$ (in the E-R algorithm) have $v$ as a common vertex in $V_2$.
Delete $e'$ from $I$, add $e$ to $I$ and then set $u$ to $u'$.
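
The bipartite rule (fix an unmatched $u$, draw one of its unused edges, extend if the other endpoint is free, otherwise rotate and continue from the displaced $u'$) can be sketched as below. The function name `bipartite_er`, the 0-based vertex numbering, and the adjacency-list bookkeeping are our own assumptions for illustration, and $n \ge 1$ is assumed.

```python
import random

def bipartite_er(n, E0, rng):
    """Sketch of E-R for bipartite matching.

    V1 = {0..n-1}, V2 = {n..2n-1}; E0 is a set of pairs (u, v) with u in V1,
    v in V2. Returns (matching, T) on success or (None, T) on failure.
    """
    # unused[u] holds the edges at u not yet examined (the current E_T at u)
    unused = {u: sorted(v for (a, v) in E0 if a == u) for u in range(n)}
    mate = {}                    # v in V2 -> its matched vertex in V1
    free = list(range(n))        # vertices of V1 not yet picked as "u"
    T = 0
    u = free.pop()               # fix a u in V1 - V1(I)
    while True:
        if not unused[u]:
            return None, T       # extension and rotation sets both empty
        v = unused[u].pop(rng.randrange(len(unused[u])))
        T += 1
        if v in mate:
            u_prev = mate[v]     # ROTATE: delete e' = {u', v}, add e = {u, v}
            mate[v] = u
            u = u_prev           # continue from the displaced vertex u'
        else:
            mate[v] = u          # EXTEND
            if not free:
                return {(a, b) for b, a in mate.items()}, T
            u = free.pop()
```

Every step consumes one edge of $E_0$, so $T \le |E_0|$ on exit, and after each step the edges recorded in `mate` form a matching.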

For the Hamiltonian line problem in random graphs we have

(1)  For the formulation as a non-proper RIS:

Let $I \in \mathcal{I}_0$ be a non-maximal simple path. We let $V(I)$ be
the vertices of $I$ and let ENDS($I$) be the vertices of $I$ of valence
$< 2$. Then the extension set is
$\mathcal{E}(I) = \{e \in E_0 - I \mid e = \{u,v\}, u \in$ ENDS($I$),
$v \in V - V(I)\}$. The rotation set is
$\mathcal{R}(I) = \{e \in E - I - \mathcal{E}(I) \mid e = \{u,v\},
u \in$ ENDS($I$), $v \in V(I) -$ ENDS($I$)$\}$.

(2)  For the formulation as a proper RIS:

Let $I \in \mathcal{I}_0$ be a set of disjoint simple paths which is not
maximum. Let $V(I)$ and ENDS($I$) be as in (1). Then the extension set
is

$\mathcal{E}(I) = \{e \in E_0 - I \mid e = \{u,v\}, u \in$ ENDS($I$),
$v \in V - V(I)\}$
$\cup \{e \in E_0 - I \mid e = \{u,v\}, u \in$ ENDS($I$),
$v \in$ ENDS($I$) and $u,v$ are in different paths of $I\}$.

The rotation set is

$\mathcal{R}(I) = \{e \in E - I - \mathcal{E}(I) \mid e = \{u,v\}$
and ($u \in$ ENDS($I$), $v \in V(I) -$ ENDS($I$)) or
($u,v \in$ ENDS($I'$) for some path $I' \subseteq I$)$\}$.

[Korte, Hausmann, 1978] proved that the greedy algorithm performs as
follows in any independence system $M = (E, \mathcal{I})$. Let $I$ be the
output of the greedy algorithm and $I_{max}$ the maximum (in cardinality)
independent set of $M$. If $M$ can be written as an intersection of $k$
matroids, then $|I| \ge |I_{max}|/k$. For the matching problem, $k = 2$.
For the (proper) RIS formulation of the Hamiltonian line problem, $k = 3$.
Note that the E-R algorithm has at least as good performance as the greedy
algorithm.


As  we  show  in  the  analysis,  if  the  probability  p  of  the  RIS  is 
bigger  than  a  certain  value,  then  rotation  succeeds  with  probability  one 
in  finding  short  augmentation  sequences  in  a  random  instance  of  the  RIS. 
Even  as  a  heuristic,  E-R  constructs  bigger  maximal  independent  sets 
than  greedy  and  has  the  same  worst-case  time  complexity  if  a  rotation 
element  can  be  always  found  in  fixed  time. 


3.2  Parameters  of  the  E-R  Algorithm 

We wish to analyze the E-R algorithm relative to the "time" index
$T$, which is incremented on each iteration of the algorithm. Note that
each "unit time" step from $T$ to $T + 1$ may include

(i)    a constant number of arithmetic and set operations

(ii)   an emptiness test for $\mathcal{E}_T(I) \cup \mathcal{R}_T(I)$

(iii)  choice of a random element of
       $\mathcal{E}_T(I) \cup \mathcal{R}_T(I)$

(iv)   choice of a "rotation" element $e' \in I$ such that if
       $e \in \mathcal{R}_T(I)$ then
       $I \cup \{e\} - \{e'\} \in \mathcal{I}_0$.

(Of course in the applications of Section 5 on a particular machine
model such as a RAM, we must determine bounds on the number of machine
instructions per "unit time step" of the algorithm.)

Let $H$ be the size of the independent set $I$ on exit (either by
successful termination or by failure). For each $h = 1,2,\ldots,H$ let
$T_h$ be the value of $T$ just after $I$ is extended from size $h-1$ to
size $h$. Also, let $T_0 = 0$ and let $T_h = |E_0|$ for
$h = H+1,\ldots,h_0$. Note that $H$ and the $T_h$ are random variables
which are fixed only for a given execution of the algorithm E-R on a given
instance $M_0$ of the RIS $M$.

Fix some constant $\alpha > 1$. For each $t = 0,1,\ldots,|E|$ let
$\underline{\varepsilon}_t(h)$, $\overline{\varepsilon}_t(h)$,
$\underline{\lambda}_t(h)$, $\overline{\lambda}_t(h)$ be functions of
domain $0 \le h \le h_0$ and range $[0,1]$.


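We require that for a class $\mathcal{V}_0$ of executions of the
Algorithm E-R with total probability $\ge 1 - |E|^{-\alpha}$,

(i)   $\underline{\varepsilon}_t(|I|) \le \Pr\{$extension of $I$ on step
      $t \mid \mathcal{E}_t(I) \cup \mathcal{R}_t(I) \ne \emptyset$ and
      given an execution in $\mathcal{V}_0\}
      \le \overline{\varepsilon}_t(|I|)$.

(ii)  $\underline{\lambda}_t(|I|) \le
      \Pr\{\mathcal{E}_t(I) \cup \mathcal{R}_t(I) = \emptyset \mid$ given
      an execution in $\mathcal{V}_0\} \le \overline{\lambda}_t(|I|)$.

Also let
$\underline{\rho}_t(h) = (1 - \overline{\lambda}_t(h)) \cdot
(1 - \overline{\varepsilon}_t(h))$
and
$\overline{\rho}_t(h) = (1 - \underline{\lambda}_t(h)) \cdot
(1 - \underline{\varepsilon}_t(h))$.

Note that $\underline{\rho}_t(h)$, $\overline{\rho}_t(h)$ are functions
such that except for executions of the E-R algorithm with total measure
$< |E|^{-\alpha}$,

$\underline{\rho}_t(|I|) \le \Pr\{$rotation of $I$ on step $t\}
\le \overline{\rho}_t(|I|)$.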

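The above (somewhat informal) statements can be related to the
random variable $T_h$ where $h = |I|$ by:

"extension of $I$ on step $t$"  $\Leftrightarrow$  "$T_{h+1} = t + 1$"

"rotation of $I$ on step $t$"  $\Leftrightarrow$  "$T_{h+1} > t + 1$"

"$\mathcal{E}_t(I) \cup \mathcal{R}_t(I) = \emptyset$"
$\Leftrightarrow$  "$T_{h+1} = |E_0|$"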

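Note that the functions $\underline{\varepsilon}_t(h)$,
$\overline{\varepsilon}_t(h)$, $\underline{\lambda}_t(h)$,
$\overline{\lambda}_t(h)$ can always be trivially defined:

$\underline{\varepsilon}_t(h) = \underline{\lambda}_t(h) = 0$,
$\overline{\varepsilon}_t(h) = \overline{\lambda}_t(h) = 1$

so they satisfy the above restrictions. In practice, of course, we wish

$|\overline{\varepsilon}_t(h) - \underline{\varepsilon}_t(h)|$ and
$|\overline{\lambda}_t(h) - \underline{\lambda}_t(h)|$

to be minimal, so that the analysis techniques of Section 5 yield
tight bounds on the time complexity of the E-R algorithm. In our graph
applications tight $\underline{\varepsilon}_t(h)$,
$\overline{\varepsilon}_t(h)$, $\underline{\lambda}_t(h)$,
$\overline{\lambda}_t(h)$ are obtained in Sections 6 and 7 for matchings
and Hamiltonian line problems.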


4.  A  Simplified  Probabilistic  Analysis  of  the  E-R  Algorithm 

We  describe  here  a  very  simplified  probabilistic  analysis  of  the 
E-R  algorithm.  A  much  more  accurate  analysis  follows  in  the  next 
section. 

The extension probability is defined as the conditional probability

Prob{a random $e$, chosen from $\mathcal{E}_T(I) \cup \mathcal{R}_T(I)$,
belongs to $\mathcal{E}_T(I)$}

and is equal to the ratio $\varepsilon(T,I) = x/(x+y)$, given that
$|\mathcal{E}_T(I)| = x$, $|\mathcal{R}_T(I)| = y$.

The definition above suggests that if there are numbers $x_{min}$,
$x_{max}$, $y_{min}$, $y_{max}$ (generally depending on $|I|$) such that
for some $\alpha > 1$

$\mathrm{Prob}\{x_{min} \le |\mathcal{E}_T(I)| \le x_{max}$ and
$y_{min} \le |\mathcal{R}_T(I)| \le y_{max}\} \ge 1 - |E|^{-\alpha}$   (*)

then

$\underline{\varepsilon}_T(h) = \frac{x_{min}}{x_{max} + y_{max}}
\le \varepsilon(T,I) \le
\frac{x_{max}}{x_{min} + y_{min}} = \overline{\varepsilon}_T(h)$

with probability $\ge 1 - |E|^{-\alpha}$, and we can use these bounds to
analyze the E-R algorithm.


The existence of nontrivial $x_{min}$, $x_{max}$, $y_{min}$, $y_{max}$
depends on both the instance of the random independence system given as
input to the algorithm and on the particular random execution of the E-R
algorithm on that instance. Hence, $1 - |E|^{-\alpha}$ is the total
probability of a class of "good" executions on a class of "good" input
instances.

Let $h$ be the cardinality of $I$ and $N$ be the biggest $|I|$ for
any such set in $\mathcal{I}$. Suppose we could show that property (*) is
satisfied with such numbers so that both $x_{min}/(x_{max} + y_{max})$ and
$x_{max}/(x_{min} + y_{min})$ are approximately equal to $1 - h/N$. Then
the behavior of the E-R algorithm would be modelled by the Markov process
of Figure 1, where the numbers in the circles are the possible $|I|$.
Thus, we would have transition probabilities

$\mathrm{Prob}\{|I| = h + 1$ at $T + 1 \mid |I| = h$ at $T\} = 1 - h/N$.

(Note that, with the above assumption, this extension probability does
not depend on the time $T$.)

Let $p(T,h)$ be the Prob{algorithm E-R achieves an independent set
$I$ of size $h$ at time $T$}. We get by inspection

$p(T,h) = p(T-1,h-1)\left(1 - \frac{h-1}{N}\right)
+ p(T-1,h) \cdot \frac{h}{N}$

and

$p(0,0) = 1$.

The solution of the above recursion would give the joint probability
density of $T$ and $h$ and, consequently, we could easily derive the
mean $\bar{T}$ for $h = N$ by

$\bar{T} = \sum_{T=0}^{|E|} p(T,N) \cdot T$
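
The recursion can be evaluated numerically with a small dynamic program. The sketch below rests on assumptions of our own: extension from size $h$ occurs with probability $1 - h/N$, size $N$ is treated as absorbing so that the returned values are first-arrival probabilities, and the horizon `t_max` is a hypothetical truncation parameter standing in for $|E|$.

```python
def first_passage_distribution(N, t_max):
    """Return f, where f[T] = Prob{|I| first reaches N at step T}.

    The chain moves from size h to h+1 with probability 1 - h/N and
    stays at h with probability h/N.
    """
    p = [0.0] * (N + 1)      # p[h] = Prob{|I| = h now, N not yet reached}
    p[0] = 1.0               # p(0, 0) = 1
    f = [0.0] * (t_max + 1)
    for T in range(1, t_max + 1):
        new = [0.0] * (N + 1)
        for h in range(N):
            stay = h / N
            new[h] += p[h] * stay            # rotation: size unchanged
            new[h + 1] += p[h] * (1 - stay)  # extension: size grows by one
        f[T] = new[N]        # mass arriving at N on exactly this step
        new[N] = 0.0         # absorb it so it is not counted again
        p = new
    return f

def mean_time(N, t_max):
    """Mean of the (truncated) first-arrival time at size N."""
    f = first_passage_distribution(N, t_max)
    return sum(T * fT for T, fT in enumerate(f))
```

For a horizon much larger than $N \log N$ the truncation error is negligible, and `mean_time(N, t_max)` agrees with the sum of the mean holding times $N/(N-h)$ over $h = 0, \ldots, N-1$.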

Let $u_h$ = mean time the algorithm stays at size $h$, before extending.
By known properties of Markov processes, we have

$u_h = \frac{1}{1 - h/N} = \frac{N}{N-h}$

Thus the mean time of execution of E-R before success is bounded by

$\bar{T} = u_0 + u_1 + \cdots + u_{N-1}
\le N\left(1 + \frac{1}{2} + \cdots + \frac{1}{N}\right) = O(N \log N)$.

Note that in most of the applications, $N = |E|^{\delta}$ with
$0 < \delta \le 1$.
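
As a worked check of the $O(N \log N)$ estimate, the sum of the holding times $u_h = N/(N-h)$ can be written out with the harmonic number $H_N$. The numerical value for $N = 100$ is our own illustrative choice and does not come from the paper.

```latex
\sum_{h=0}^{N-1} u_h
  = \sum_{h=0}^{N-1} \frac{N}{N-h}
  = N \sum_{j=1}^{N} \frac{1}{j}
  = N H_N
  \le N (1 + \ln N).
% Example: N = 100 gives H_{100} \approx 5.187, so about 519 steps on average.
```

This is the same computation as in the coupon collector problem: the later extensions dominate, since the holding time at size $h$ grows as $h$ approaches $N$.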

The above $\bar{T}$ was produced by the assumption of a "good" class of
inputs and executions. In a bad case, the algorithm will fail or stop
after time at most $|E|$, hence

$\bar{T}_{total} \le \bar{T}(1 - |E|^{-\alpha}) + |E| \cdot |E|^{-\alpha}$

and since $\alpha > 1$ we get as $|E| \to \infty$ that

$\bar{T}_{total} = O(N \log N)$.

This is the phenomenon approximately followed in the E-R algorithm.
However, in general the extension probabilities depend also on time
(the next section takes this dependence into account).

In applications in random graphs $G_{n,p}$ (where usually $I$ is a
set of edges) we note that $T$ is equal to the number of edges examined
by $T$, and $h$ is equal to the number of edges successfully extending $I$
by $T$. Hence, the number of deleted edges by $T$ is $T - h$ and this
has to be less than or equal to the number $\ell$ of edges from each
vertex of $I$ to all other vertices of $I$ (since, as we shall prove for
graph applications, we only delete edges whose vertices stay in $I$). The
average $\ell$ is $ph(h+1)$ and the average $T - h$ is
$\bar{T}_{total} - h$. By the above, in order for the algorithm to achieve
the maximum size $N$ of $h$, we have

$pN(N+1) \ge \bar{T}_{total} - N$

or edge probability

$p \ge \frac{\bar{T}_{total} - N}{N(N+1)} = O(\log N / N)$.

So, we see that an edge probability of at least $O(\log N/N)$ is necessary
for the E-R algorithm to work in graph problems. The constants of
multiplication for particular cases follow from the exact analysis given
in Section 5 and also in Sections 6 and 7.


5.  Rigorous  Probabilistic  Analysis  of  the  E-R  Algorithm 

We fix an RIS $(E, \mathcal{I}, p)$ throughout this section, and consider
a random instance $(E_0, \mathcal{I}_0)$ given as input to the E-R
algorithm. All our applications of Sections 6 and 7 satisfy the following
monotonicity restrictions:

R1  $\underline{\varepsilon}_t(h)$, $\overline{\varepsilon}_t(h)$ are
    monotonically decreasing with $h$ but increasing with $t$.

R2  $\underline{\lambda}_t(h)$, $\overline{\lambda}_t(h)$ are
    monotonically increasing with $h$ and $t$.

Intuitively, we assume that the conditional probability of extension
decreases with $h = |I|$ and that the probability of failure increases
as $I$ grows and as the elements of $E_0$ are exhausted.

5.1  Sufficient  Conditions  for  Success  with  High  Likelihood 

Note that if $Q$ is a predicate and $A$ an event on which $Q$ is
predicated, we let Prob{$Q/A$} denote the conditional probability of
$Q$ given that $A$ holds.

Our goal here is to derive sufficient conditions such that for any
fixed sufficiently large $\alpha_0 > 1$,

$\Pr\{H = h_0\} \ge 1 - |E|^{-\alpha_0}$

(i.e., the E-R algorithm succeeds in constructing an independent set of
size $h_0$ with probability $\ge 1 - |E|^{-\alpha_0}$).

Assuming the above restrictions R1, R2, we can derive bounds for

$\mathrm{EXT}_h = \Pr\{H > h \mid H \ge h$, $t' = T_{h+1} - 1$ and given
an execution in $\mathcal{V}_0\}$


PROPOSITION  5.1. 


et(h)-(l-Xt,(h)>- 


1-Pt,{h) 


1-Pt,  t>» 


<  EXT. 


<  et,(h)-U-Xt<h))* 


1-Pt(h) 


lEol-t+1 


1  -  Pt(h) 


Unfortunately, we found that a direct derivation of $\Pr\{H = h_0\}$ by
use of Proposition 5.1 is intractable, because of the stubborn
appearance of the random variables $T_h$ in the conditional probabilities.
(Thus Proposition 5.1, as stated, is never used in our analysis of the
E-R algorithm.)

To bound the random variable $|E_0|$, we may use the following known
fact:

LEMMA 5.1. If $M$ is an RIS $(E, \mathcal{I}, p)$ and
$(E_0, \mathcal{I}_0)$ an instance of $M$, then

$\mathrm{Prob}\{p|E|(1-\beta) \le |E_0| \le p|E|(1+\beta)\}
\ge 1 - |E|^{-\alpha}$

for a suitable $\beta \in (0,1)$ depending on $\alpha$, $p$ and $|E|$.

Proof. Recall that the elements of $E_0$ are chosen from $E$ with
fixed probability $p$. Then this Lemma follows from the Chernoff bounds:

$\sum_{k=\lceil (1+\beta)|E|p \rceil}^{|E|} \binom{|E|}{k} p^k
(1-p)^{|E|-k} \le \exp(-\beta^2 |E| p / 3)$

$\sum_{k=0}^{\lfloor (1-\beta)|E|p \rfloor} \binom{|E|}{k} p^k
(1-p)^{|E|-k} \le \exp(-\beta^2 |E| p / 2)$
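
The two Chernoff bounds can be compared with the exact binomial tails for small ground sets. The sketch below is only a numerical illustration; the function names `binomial_tails` and `chernoff_bounds`, and the parameter values in the usage example, are our own choices.

```python
from math import comb, exp, ceil, floor

def binomial_tails(n, p, beta):
    """Exact tail masses of |E_0| ~ Binomial(n, p), where n stands for |E|.

    Returns (upper, lower):
      upper = Prob{|E_0| >= ceil((1+beta) n p)}
      lower = Prob{|E_0| <= floor((1-beta) n p)}
    """
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    upper = sum(pmf[ceil((1 + beta) * n * p):])
    lower = sum(pmf[:floor((1 - beta) * n * p) + 1])
    return upper, lower

def chernoff_bounds(n, p, beta):
    """The bounds exp(-beta^2 n p / 3) and exp(-beta^2 n p / 2)."""
    return exp(-beta**2 * n * p / 3), exp(-beta**2 * n * p / 2)

if __name__ == "__main__":
    print(binomial_tails(200, 0.3, 0.5))
    print(chernoff_bounds(200, 0.3, 0.5))
```

In such experiments the exact tails are typically far below the bounds, which is consistent with the bounds being used only to control the failure probability up to a power of $|E|$.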


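The following two conditions in conjunction imply
$\mathrm{Prob}\{H = h_0\} \ge 1 - (1 + c_0)|E|^{-\alpha}$:

C1  For some fixed $t_0, \ldots, t_{h_0}$,
    $\underline{\lambda}_t(h) = \overline{\lambda}_t(h) = 0$ for
    $0 \le h \le h_0$ and $0 \le t \le t_h$.

C2  $\mathrm{Prob}\{T_{h_0} < |E_0|\} \ge 1 - c_0|E|^{-\alpha}$, for some
    $c_0 > 1$.

Note that C1 does not suffice to imply anything about $\Pr\{H = h_0\}$
since we may frequently fail if the time $t$ exceeds $t_h$.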

5.2  Verification of Condition C2

We now assume that conditions R1, R2 and C1 have been verified for
some $t_0, t_1, \ldots, t_{h_0}$ and derive bounds on the critical $p$
which insures condition C2 is satisfied.

To verify C2, we require upper and lower bounds on the distribution
of steps between extensions. Let $g(x,q) = q(1-q)^x$ be the geometric
density function. Let $\mathcal{V}_0$ be the class of executions of
algorithm E-R with probability $\ge 1 - |E|^{-\alpha}$, which were used in
the definition of the $\underline{\varepsilon}_t(h)$,
$\overline{\varepsilon}_t(h)$.

Also, let $S$ be the condition

$T_{h+1} < t_{h+1}$, $t = T_h < |E_0|$ and given an execution in
$\mathcal{V}_0$.

Proof. By condition C1 and monotonicity restriction R1,

$\overline{\rho}_t(h) = 1 - \underline{\varepsilon}_t(h)
\le 1 - \underline{\varepsilon}_{T_h}(h)$

for $0 \le h \le h_0$ and $T_h \le t \le t_h$, and

$\Pr\{T_{h+1} - T_h = x + 1 \mid S\} \le \overline{\varepsilon}_{t+x}(h)
\prod_{k=t}^{t+x-1} \overline{\rho}_k(h)$
We now derive bounds on the steps between extensions. For
$h = 0,\ldots,h_0$ and $t = 0,\ldots,t_h$ let
$\overline{\delta}_t(h) = \mathrm{MAX}(h, h')$ where


,  ,t+i  eth(h,(1-lEl‘ 1 

h*  »  log  il-e.  (h)  )n  +  - - 

'  f  et(h) 


and  let 


6t(h)  =  log  1 


LEMMA  5.3. 


E  (h)  (1-  |  E  t  ~°)  1  /  ,  A  V 

- |/14-Vh))  * 


$\Pr\{\underline{\delta}_t(h) \le T_{h+1} - T_h \le
\overline{\delta}_t(h) \mid T_{h+1} < t_{h+1}, t = T_h\}
\ge 1 - 3|E|^{-\alpha}$

Proof. Recall that $\Pr\{$given an instance in $\mathcal{V}_0\}
\ge 1 - |E|^{-\alpha}$ by definition.


It  suffices  to  verify: 


<5(h)-l 


pr(Th+i -Th<6(h)  |s>  =  £  Pr{Th+1-Th  =  x  +  l|s} 


by  Lemma  5 . 2 


e.. «...  ,  (h)  5(h)-l 

>  -  V  e.(h)(l-e  (h)>‘ 

et(h)  x=0 


et+S(h)-l(h) 

et(h) 


et(h)  f 

>  tt—  1-  (1 

et(h)  L 


1  -  (1 


-  et(h))^(h)j 


-  ct(h))^(h>| 


by  R1 


>  1-  E 


by  elementary  calculations, 


Similarly, we can show:

$\Pr\{T_{h+1} - T_h > \underline{\delta}(h) \mid S\}
\ge 1 - |E|^{-\alpha}$


As  a  consequence  of  Lemma  5.3,  we  may  use  for  l<h^hr 


h-1 

fi(w  -  n  «A(i)<ii 


i*0 
$\mathrm{Prob}\{\underline{\Delta}(h) \le T_h \le \overline{\Delta}(h)
\mid T_h < t_h\} \ge 1 - 3h|E|^{-\alpha}$

Note that we may assume without loss of generality that $t_h < B$. By the
monotonicity condition R1, we can show $\Pr\{T_h = k\}$ is unimodal for
$k \in \{0,1,\ldots,|E|\}$. Thus

$\Pr\{T_h > t_h \mid |E_0| < B\} \le \mathrm{Prob}\{T_h <
\underline{\Delta}(h)$ or $\overline{\Delta}(h) < T_h \mid T_h < t_h\}
\cdot r \le 3hr|E|^{-\alpha}$

But

$\mathrm{Prob}\{T_h > t_h\} \le \Pr\{T_h > t_h \mid |E_0| < B\}
+ |E|^{-\alpha} \le (3hr + 1)|E|^{-\alpha}$.

So

$\mathrm{Prob}\{T_h < \underline{\Delta}(h)$ or
$\overline{\Delta}(h) < T_h\}$

$\le \mathrm{Prob}\{T_h < \underline{\Delta}(h)$ or
$\overline{\Delta}(h) < T_h \mid T_h < t_h\}
+ \mathrm{Prob}\{T_h > t_h\}$

$\le a(h)|E|^{-\alpha}$.  $\Box$

Note that Theorem 5.1 may be restated:

If $\overline{\Delta}(h) < t_h$ then
$\mathrm{Prob}\{H \ge h\} \ge 1 - |E|^{-\alpha(h)}$ where

$\alpha(h) = \alpha - \frac{\log(a(h))}{\log(|E|)}$

Furthermore, if we wish

$\mathrm{Prob}\{H \ge h_0\} \ge 1 - |E|^{-\alpha_0}$

for any given $\alpha_0$ sufficiently large then we find a minimal
$p \in (0,1)$ such that the restrictions of Theorem 5.1 are satisfied and
$\alpha_0 = \alpha(h_0)$. (Note that if $M = (E, \mathcal{I})$ is a proper
random independence system and $(E, \mathcal{I})$ has rank $\ge h_0$, then
such a $p$ always exists.)
5.3  Bounds on the Probability Density Function of $T_h$

We assume here the restrictions given in Theorem 5.1. Actually, we
have a much more general result, since we have from Lemma 5.2 bounds on
the probability density function of $T_{h+1} - T_h$ for
$h = 1,\ldots,h_0 - 1$. By the monotonicity restrictions R1, for
$x = 0,\ldots,|E|$

eA(h+l)-l  ‘h)  (1_q(h)) 

<  Prob{Th+1  -  Th  =  x  l|A<h)  <Th<A(h),A(h+l)  <  Th+1  <  A(h+1)  } 

<'*(  h+l)-l(h){1^(h))X 
where 

q(h)  "  eA(h)(h)'  q(h)  =  ^(h)(h)  * 


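COROLLARY 5.1. For $h = 0, \ldots, h_0 - 1$

$$\frac{\underline{\varepsilon}_{\underline{A}(h+1)-1}(h)}{\hat{q}(h)}\, g(x, \hat{q}(h)) - |E|^{-\alpha(h+1)} < \Pr\{T_{h+1} - T_h = x+1\} < \frac{\hat{\varepsilon}_{\hat{A}(h+1)-1}(h)}{\underline{q}(h)}\, g(x, \underline{q}(h)) + |E|^{-\alpha(h+1)}$$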


The Appendix gives the density function of a random variable which is a sum of variables with distinct geometric distributions, and from this and by the bounds of Corollary 5.1, we have upper and lower bounds on the probability density function of the sum:

$$T_h = \sum_{k=0}^{h-1} (T_{k+1} - T_k)$$

THEOREM 5.2. For $h = 0, \ldots, h_0 - 1$

$$\underline{Q}(h) - h|E|^{-\alpha(h+1)} < \Pr\{T_h = x\} < \hat{Q}(h) + h|E|^{-\alpha(h+1)}$$

where


and 


h-1  ^  .  .  h-1  *  , 

<?<h)  =  v  £  g(x,q(i))(l-q(i)h‘2  T1  -  - 

’  i=0  j=l  q(i)  -  q( j) 


„  /bjj1  £A(k+l)-l(k)  \ 
\k=0  q(k)  / 


h-1 


h-2 


h-1 


Q(h)  =  w  ^2  g(x,q(i) )  (1  -  q(i)  )**  Ft 


h  *  . 
1=0 


alii 


j=1  q(i)-q(D) 


and 


-  h"L  “A(k+l)-l(k) 

w,  =  II 

h  k.O 


q(k> 


Thus, if the restrictions of Theorem 5.1 are satisfied (as they are in our applications in Sections 6 and 7) we can derive by routine methods the mean, variance, and in general any moment of the time cost of Algorithm E-R.
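To make the "routine methods" concrete, the following sketch computes the density of a sum of independent geometric variables by direct convolution and reads off the mean and variance. It is an illustration only: the parameters `qs`, the truncation point `kmax`, and all function names are assumptions, and each variable here counts the failures $x$ before a success (the paper's increment $T_{h+1} - T_h$ equals $x + 1$).

```python
def geometric_pmf(q: float, kmax: int) -> list:
    """g(x, q) = q * (1 - q)**x for x = 0..kmax."""
    return [q * (1.0 - q) ** x for x in range(kmax + 1)]


def convolve(a: list, b: list, kmax: int) -> list:
    """Density of the sum of two independent variables, kept for 0..kmax."""
    out = [0.0] * (kmax + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > kmax:
                break
            out[i + j] += ai * bj
    return out


def sum_density(qs: list, kmax: int) -> list:
    """Density of X_1 + ... + X_m with X_i geometric of parameter qs[i]."""
    dens = [1.0] + [0.0] * kmax  # point mass at 0
    for q in qs:
        dens = convolve(dens, geometric_pmf(q, kmax), kmax)
    return dens


def mean_and_variance(dens: list) -> tuple:
    """First moment and central second moment of a density on 0..kmax."""
    mean = sum(k * pk for k, pk in enumerate(dens))
    second = sum(k * k * pk for k, pk in enumerate(dens))
    return mean, second - mean * mean


if __name__ == "__main__":
    qs = [0.9, 0.7, 0.5]  # illustrative success probabilities
    mean, var = mean_and_variance(sum_density(qs, kmax=200))
    print(mean, sum((1 - q) / q for q in qs))
    print(var, sum((1 - q) / q ** 2 for q in qs))
```

For independent geometric variables the mean and variance are the sums of $(1-q)/q$ and $(1-q)/q^2$ respectively, which gives a quick check on the convolution.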
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Problems 

6.1  Motivation  and  Previous  Work 

Posa [1976] proved that $p = O(\log n / n)$ suffices for the existence of Hamiltonian paths in $G_{n,p}$, previously an open problem in Erdos and Spencer [1974]. Karp [1976] observed that Posa's proof yields a polynomial time algorithm for constructing Hamiltonian paths in a random instance of $G_{n,p}$. Angluin and Valiant [1979] then generalized this Posa-Karp Algorithm to detect Hamiltonian paths in random directed graphs.

We can also extend the Posa-Karp Algorithm to the problem of identifying certain classes of isomorphic subgraphs. Consider the problem for a fixed graph H and random graph $G_{n,p}$:

Is H homeomorphic to a subgraph of $G_{n,p}$?

The  answer  to  this  problem  is  very  useful  for  determining  the  probability 
of  a  property  characterizable  by  forbidden  subgraphs  (e.g.,  Kuratowski's 
[1971] forbidden subgraphs for planar graphs, Glover and Huneke's [1975]
forbidden subgraphs for graphs imbedded onto the projective plane,
Lekkerkerker and Boland's [1962] forbidden subgraph characterization of
interval  graphs) .  Erdos  and  Spencer  [1974]  determined  the  probability 
that  a  random  graph  is  planar  by  forbidden  subgraph  methods,  and  Cohen, 
Komlos  and  Mueller  [1979]  found  the  probability  that  a  random  graph  is 
an  interval  graph  by  similar  methods. 

Actually, we can show that a large class of forbidden subgraph problems on random graphs can be efficiently reduced to the problem of determining a Hamiltonian path. Suppose H is a graph with k edges. Given an instance $G_0$ of a random graph $G_{n,p}$ we wish to construct a subgraph $G'$ of $G_0$ such that $G'$ is homeomorphic to H. (See Figure 2.)

We partition the edges of $G_{n,p}$ into k blocks of cardinality n/k, with each block corresponding to an edge of H. Choose these blocks so that they have a unique "joining vertex" in common just in the case the corresponding edges of H do. Such a partitioning requires only linear time since k is constant. Then we test (by the Posa-Karp Algorithm) if each block of the partitioning has a Hamiltonian path between the "joining vertices" of the block. Each block is considered a random graph with edge probability $p' = p/k$. The application of the Posa-Karp Algorithm then yields the required Hamiltonian paths in each block with probability $> 1 - n^{-\alpha}$ for any sufficiently large $\alpha > 1$, if $p > c(k) \frac{\log n}{n}$ and $c(k) > k/2$.

6.2  Analysis  of  the  Posa-Karp  Algorithm 

We now give a detailed analysis of the Posa-Karp Algorithm for detecting a Hamiltonian path in a random graph $G_{n,p}$. We follow the analysis techniques developed in Section 5.
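For orientation, here is a minimal sketch of the standard extension and rotation heuristic for Hamiltonian paths. It is not claimed to be the E-R algorithm of Section 3.1, which is not reproduced here; the uniform random choice among unexamined edges at the current endpoint, the function names, and the return convention are all assumptions made for illustration. Each step examines one edge and then discards it, which matches the notion of "deleted" edges used in the analysis below.

```python
import random


def extension_rotation_path(n, edges, start=0, rng=None):
    """Try to build a Hamiltonian path on vertices 0..n-1.

    Returns (path, steps); path is None if the endpoint runs out of
    unexamined edges. One edge is examined per step.
    """
    rng = rng or random.Random()
    unused = {v: set() for v in range(n)}  # unexamined edges per vertex
    for u, v in edges:
        unused[u].add(v)
        unused[v].add(u)
    path = [start]
    position = {start: 0}  # vertex -> index in path
    steps = 0
    while len(path) < n:
        end = path[-1]
        if not unused[end]:
            return None, steps
        w = rng.choice(sorted(unused[end]))
        unused[end].discard(w)  # the examined edge is never reused
        unused[w].discard(end)
        steps += 1
        if w not in position:
            # Extension: w is outside the path, so append it.
            position[w] = len(path)
            path.append(w)
        else:
            # Rotation: reverse the tail after w, giving a new endpoint.
            i = position[w]
            path[i + 1:] = reversed(path[i + 1:])
            for j in range(i + 1, len(path)):
                position[path[j]] = j
    return path, steps


def is_hamiltonian_path(n, edges, path):
    """Check that path visits every vertex once along edges of the graph."""
    edge_set = {frozenset(e) for e in edges}
    return (sorted(path) == list(range(n)) and
            all(frozenset((a, b)) in edge_set for a, b in zip(path, path[1:])))


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 200, 0.2  # illustrative parameters
    edges = [(u, v) for u in range(n) for v in range(u + 1, n)
             if rng.random() < p]
    path, steps = extension_rotation_path(n, edges, rng=rng)
    print(path is not None, steps)
```

The step counter plays the role of the time variable T in the analysis: it increases by one for every examined edge, whether the step is an extension or a rotation.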

Step A: Formulation as an RIS

We  will  follow  here  the  formulation  as  a  non-proper  RIS  (see  2.3, 
Examples  of  RIS) .  The  extension  and  rotation  operations  are  described 
in  3.1  of  this  paper.  The  formulation  as  a  proper  RIS  (see  2.3,  3.1) 

leads  to  a  different  algorithm  than  the  algorithm  proposed  by  Karp.  A 
similar  analysis  to  the  analysis  presented  in  this  chapter  can  show  that 


this  new  algorithm  has  the  same  performance  and  the  same  probability  of 
success  as  the  Posa-Karp  Algorithm. 

Step  B:  Derivation  of  the  Bounding  Parameters:  et(h),  g t (h) , 

Xt(h),  Xt(h) 

Let V be the set of n vertices of the random graph $G_{n,p}$. Let $(E_0, \mathcal{I}_0)$ be an instance of $(E, \mathcal{I}, p)$ given as input to the E-R algorithm. Let I be an independent set of cardinality h, constructed after t steps of the E-R algorithm. Recall

$$\mathcal{E}_T(I) = \{ e \in E \mid e = \{u, v\},\ u \in \mathrm{ENDS}(I),\ v \in V - V(I) \},$$

where V(I) is the vertex set of I. Thus, the structure of $\mathcal{E}_T(I)$ for a particular V(I) depends only on the input instance $(E_0, \mathcal{I}_0)$. The E-R algorithm does not look at any of these edges at times $T' < T$, since if E-R examines an edge e at time $T'$ then both vertices of e stay permanently in V(I) for all $T > T'$.

LEMMA 6.1. For every $\beta$, $0 < \beta < 1$, and for any $p > c \frac{\log n}{n}$ with $c > 0$ we have

$$(1-\beta)\,2p(n-h) < |\mathcal{E}_T(I)| < (1+\beta)\,2p(n-h)$$

with probability

$$> 1 - 2n^{-\left(1 - \frac{h}{n}\right)\beta^2 c/3}.$$

Proof. We have observed that $|\mathcal{E}_T(I)|$ does not depend on the random variable T of the algorithm. It only depends on $|I| = h$. By definition of the $G_{n,p}$ model

$$\mathrm{Prob}\{|\mathcal{E}_T(I)| = j\} = \binom{2(n-h)}{j} p^j (1-p)^{2(n-h)-j}.$$

The Lemma then follows by the Chernoff bounds. □
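The concentration claimed here can be compared against the exact binomial distribution. The sketch below is illustrative: it uses the standard two-sided multiplicative Chernoff bound $2\exp(-\beta^2 \mu / 3)$ with $\mu = 2p(n-h)$, which may differ in its constants from the form stated in the lemma, and the parameter values and function names are assumptions.

```python
import math


def binom_log_pmf(trials: int, j: int, p: float) -> float:
    """log of C(trials, j) * p**j * (1 - p)**(trials - j)."""
    return (math.lgamma(trials + 1) - math.lgamma(j + 1)
            - math.lgamma(trials - j + 1)
            + j * math.log(p) + (trials - j) * math.log1p(-p))


def prob_in_band(n: int, h: int, p: float, beta: float) -> float:
    """Exact P((1-beta)*mu <= X <= (1+beta)*mu), X ~ Binomial(2(n-h), p)."""
    trials = 2 * (n - h)
    mu = trials * p
    lo = math.ceil((1.0 - beta) * mu)
    hi = min(math.floor((1.0 + beta) * mu), trials)
    return sum(math.exp(binom_log_pmf(trials, j, p)) for j in range(lo, hi + 1))


def chernoff_lower_bound(n: int, h: int, p: float, beta: float) -> float:
    """1 - 2*exp(-beta^2 * mu / 3): standard two-sided Chernoff bound."""
    mu = 2.0 * p * (n - h)
    return 1.0 - 2.0 * math.exp(-beta * beta * mu / 3.0)


if __name__ == "__main__":
    n, h, p, beta = 2000, 1000, 0.05, 0.5  # illustrative values
    print(prob_in_band(n, h, p, beta), chernoff_lower_bound(n, h, p, beta))
```

The exact probability is never smaller than the Chernoff estimate, and for moderate n it is usually much closer to 1.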

By Lemma 6.1, the mean value of $|\mathcal{E}_T(I)|$ is $2p(n-h)$.

In the following lemma, $|E| = n(n-1)/2$.

LEMMA 6.2. Let $E_{\min} = (1-\beta)\,2p(n-h)$ and $E_{\max} = (1+\beta)\,2p(n-h)$ with $0 < \beta < 1$ and $p > c \frac{\log n}{n}$ where $c > 6/\beta^2$. Then there is an $\alpha > 1$ such that

$$\mathrm{Prob}\{E_{\min} < |\mathcal{E}_T(I)| < E_{\max}\} > 1 - |E|^{-\alpha}$$

for $T = 0, \ldots, |E|$.

Proof. By Lemma 6.1 we can get $\alpha = \beta^2 c / 6$ so that $\alpha > 1$ if $c > 6/\beta^2$. □

In the following, we consider edges examined by the algorithm but not added to I to be deleted.

LEMMA 6.3. Then the mean number of deleted edges per vertex of I is the same for every $v \in V(I)$ and is equal to $t/h - 1$.

Proof. Since the algorithm examines an edge at each time step and since we have h edges at time t, the number of deleted edges is $t - h$. These edges have their vertices in I (as previously noted). So, it is enough to show that the mean number of visits of the E-R algorithm to each vertex of I by time t is the same. This follows by symmetry and since the algorithm selects at random an edge e from $\mathcal{E}_t(I) \cup \mathcal{R}_t(I)$ before each extension or rotation. Hence we get that the mean number of deleted edges per vertex of V(I) is $(t-h)/h$, proving the lemma. Note that this holds for any value of p.

COROLLARY 6.3. For any $\beta \in (0,1)$ there exist constants $c > 1/2$ and $\alpha > 1$ such that if $p > c \frac{\log n}{n}$ and m = number of deleted edges per vertex of V(I) by time $t < O(n \log n)$ then

$$(1-\beta)\left(\frac{t}{h} - 1\right) < m < (1+\beta)\frac{t}{h}$$

with probability $> 1 - |E|^{-\alpha}$.

Proof. We will first observe that for any numbers k, A and $p > c \frac{\log n}{n}$ with $c > 2(A + k + 1)$ we have

$$\mathrm{Prob}\{\text{every vertex in } G_{n,p} \text{ has} > k \log n \text{ edges}\} > 1 - O(n^{-A}).$$

To see this, if v is any node then the probability that v has $< k \log n$ neighbors can be bounded by $O(n^{-A-1})$ by the Chernoff bounds. The result follows by summing over the n choices of v (see also the Sociability Lemma of [Angluin, Valiant, 1979]).
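The degree observation can be illustrated with an exact binomial tail and a union bound over the n vertices. This is a sketch under stated assumptions: natural logarithms are used, the degree of a vertex is taken as Binomial(n-1, p), and the constants in the example are chosen only to satisfy $c > 2(A + k + 1)$.

```python
import math


def binom_log_pmf(trials: int, j: int, p: float) -> float:
    """log of C(trials, j) * p**j * (1 - p)**(trials - j)."""
    return (math.lgamma(trials + 1) - math.lgamma(j + 1)
            - math.lgamma(trials - j + 1)
            + j * math.log(p) + (trials - j) * math.log1p(-p))


def prob_low_degree(n: int, p: float, k: float) -> float:
    """P(deg(v) < k log n) for deg(v) ~ Binomial(n - 1, p)."""
    threshold = k * math.log(n)
    jmax = math.ceil(threshold) - 1  # largest integer strictly below threshold
    return sum(math.exp(binom_log_pmf(n - 1, j, p)) for j in range(jmax + 1))


def union_bound_low_degree(n: int, c: float, k: float) -> float:
    """Upper bound on P(some vertex has degree < k log n), p = c log n / n."""
    p = c * math.log(n) / n
    return min(1.0, n * prob_low_degree(n, p, k))


if __name__ == "__main__":
    n, k, A = 1000, 1.0, 1.0    # illustrative values
    c = 2 * (A + k + 1) + 0.5   # satisfies c > 2(A + k + 1)
    print(union_bound_low_degree(n, c, k), n ** (-A))
```

For these values the union bound is many orders of magnitude below $n^{-A}$, which reflects how much slack the condition on c leaves.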

We  shall  also  utilize  the  bottleneck  lemma  [Angluin,  Valiant,  1979] 
which  can  be  described  as  follows: 

Let us have a rooted tree of depth m and uniform S-way branching. Let Y be a set of paths from the root to certain of the leaves of the tree. Let us color green all nodes in these paths. Assume that along each path of Y there exist k nodes (called bottlenecks) such that at the i-th such node the probability of drawing a green successor is at most $p_i$. Then,

BOTTLENECK LEMMA. The probability that a bottleneck will cut our random path to a green leaf is $< p_1 \cdot p_2 \cdots p_k$.

We can now complete the proof of the Corollary: Let the tree above be the tree of possible executions of E-R. Any vertex of I visited less than $(1-\beta)(t/h - 1)$ times or more than $(1+\beta)\,t/h$ times can be considered as a bottleneck and this event would be bounded by the sum of the probabilities

[(n-  +  (n-  <!♦»(£- l))"*  103  ” 

for all possible vertices, which is $O(n^{-\alpha})$ for suitable values of k. □

LEMMA 6.4. For any $\beta \in (0,1)$ there are constants $\alpha > 1$ and $c = c(\beta, \alpha) > 0$ such that

$$(1-\beta)\,2ph - (1+\beta)\frac{t}{h} < |\mathcal{R}_t(I)| < (1+\beta)\,2ph - \left(\frac{t}{h} - 1\right)(1-\beta)$$

holds with probability $> 1 - |E|^{-\alpha}$.

Proof. Let $A_0$ be the number of edges from endpoints of I to vertices of I at time $t = 0$ (V(I) is fixed here) and let $A_t$ be the number of edges deleted from the endpoints of I up to time t. Then $|\mathcal{R}_t(I)| = A_0 - A_t$. By Lemma 6.2 and Corollary 6.3 we get the result.

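Applying Property (*) at the beginning of Chapter 4 and Lemmas 6.2, 6.4 we get that

$$\underline{\varepsilon}_t(h) = \frac{(n-h)(1-\beta)}{n(1+\beta) - \frac{1}{2p}\left(\frac{t}{h} - 1\right)(1-\beta)}, \qquad \hat{\varepsilon}_t(h) = \frac{(n-h)(1+\beta)}{n(1-\beta) - \frac{t}{2ph}(1+\beta)}$$

are bounds on the conditional extension probability of the E-R algorithm:

$$\underline{\varepsilon}_t(h) \le \frac{|\mathcal{E}_t(I)|}{|\mathcal{E}_t(I)| + |\mathcal{R}_t(I)|} \le \hat{\varepsilon}_t(h)$$

for executions in a class $\mathcal{A}_0$ of total probability $> 1 - |E|^{-\alpha}$. Observe that

$$\frac{\partial \underline{\varepsilon}_t(h)}{\partial t} > 0, \qquad \frac{\partial \hat{\varepsilon}_t(h)}{\partial t} > 0, \qquad \frac{\partial \underline{\varepsilon}_t(h)}{\partial h} < 0 \qquad \text{and} \qquad \frac{\partial \hat{\varepsilon}_t(h)}{\partial h} < 0$$

so the monotonicity condition R1 is satisfied.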

Note that Lemma 5.1 fixes $\beta = \sqrt{6\alpha \log|E| / (|E| \cdot p)}$ where $|E| = n(n-1)/2$. Note also that since $2ph$ is the mean value of the number of edges from the endpoints of I to the other vertices of I at the beginning of an execution of E-R and since $2t/h$ is the mean number of deleted edges from the endpoints of I by the time t, we must have (in order for E-R not to stop at t) that

$$t/h < ph \quad \text{or} \quad t < ph^2.$$

For $T = O(n \log n)$ and $h = n - 1$ this again implies $p > O\left(\frac{\log n}{n}\right)$ for the E-R algorithm to be able to construct a Hamiltonian line. (Compare with the general statement at the end of Section 4.)

Restriction R2 can be readily verified for it is obvious that

$$\mathrm{Prob}\{\mathcal{E}_t(I) \cup \mathcal{R}_t(I) = \emptyset\}$$

monotonically increases with t and $h = |I|$.

To satisfy condition C1 we set $t_h = 2pnh(1-\beta)$. Then for executions in $\mathcal{A}_0$ and $0 < t < t_h$,

$$\mathcal{E}_t(I) \cup \mathcal{R}_t(I) \ne \emptyset.$$

Step C: Verification of C2

We now must verify condition C2 to ensure the algorithm succeeds with high probability. For simplicity, we proceed with the asymptotic analysis as $n \to \infty$ (although the techniques of Section 5 allow analysis for any fixed n as well). Note that $\beta \to 0$ as $n \to \infty$, so

$$\hat{\varepsilon}_t(h) \sim \underline{\varepsilon}_t(h) \sim \frac{n - h}{n - \frac{t}{2ph}}$$

so in the asymptotic case the bounding parameters are identical.
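The asymptotic form $(n-h)/(n - t/(2ph))$ is easy to probe numerically. The sketch below evaluates it on a grid and checks the two monotonicity directions required by R1; the function name, the guard on the denominator, and the sample values of n and p are assumptions made for illustration, and the check is restricted to a range of t and h where the expression stays below 1.

```python
def extension_probability(t: float, h: int, n: int, p: float) -> float:
    """Asymptotic conditional extension probability (n - h) / (n - t/(2ph))."""
    if h <= 0:
        raise ValueError("h must be positive")
    denom = n - t / (2.0 * p * h)
    if denom <= 0:
        raise ValueError("t is too large for this h: denominator not positive")
    return (n - h) / denom


def is_strictly_increasing(values) -> bool:
    return all(a < b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    n, p = 1000, 0.05  # illustrative values
    in_t = [extension_probability(t, 100, n, p) for t in range(0, 200)]
    in_h = [extension_probability(100, h, n, p) for h in range(50, 900)]
    print(is_strictly_increasing(in_t))        # increasing in t
    print(is_strictly_increasing(in_h[::-1]))  # decreasing in h
```

With $t < t_h = 2pnh(1-\beta)$ the denominator is at least $\beta n$, so the guard in the function never triggers inside the range allowed by condition C1.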
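Also, as $n \to \infty$,

$$\delta_t(h) \sim \frac{-\alpha \log |E|}{\log(1 - \varepsilon_t(h))}$$

where

$$|E| = \frac{n(n-1)}{2}.$$

We must determine

$$\hat{A}(h+1) = \hat{A}(h) + \hat{\delta}_{\hat{A}(h)}(h).$$

Let

$$k_1 = \frac{pn}{\log n}.$$

We now show by induction on h that $\hat{A}(h) \le k_2 h \log n$ where $k_2 = \frac{2\alpha k_1}{2\alpha + k_1}$.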
LEMMA 6.5. Assume $p > c \frac{\log n}{n}$. Then $\hat{A}(x) \le kx \log n$, where $k \ge \frac{2\alpha c}{2\alpha + c}$ and $\alpha$ is the constant appearing in $\hat{\delta}_t(h)$.

Proof. We have from the definition of $\hat{A}(h)$ that

$$\hat{A}(h) = \sum_{i=0}^{h-1} \hat{\delta}_{\hat{A}(i)}(i)$$

and $\hat{A}(0) = 0$. It follows that $\hat{A}(h) = \hat{A}(h-1) + \hat{\delta}_{\hat{A}(h-1)}(h-1)$.

Also from

$$\delta_t(h) = \frac{-\alpha \log |E|}{\log(1 - \varepsilon_t(h))}$$

as $n \to \infty$ we get

$$\delta_t(h) \sim \frac{-2\alpha \log n}{\log(1 - \varepsilon_t(h))}$$

as $n \to \infty$.


Basis:

Since at 0 edges, E-R will increase the size of I with certainty in the first attempt, we must have $\hat{\delta}_0(0) = 1$. Then $\hat{A}(1) = \hat{A}(0) + \hat{\delta}_{\hat{A}(0)}(0) = 1 \le k \log n$ for large n.

Induction Hypothesis:

Assume that for $k \ge \frac{2\alpha c}{2\alpha + c}$ and all j in $\{0, 1, \ldots, x-1\}$ it is true that $\hat{A}(j) \le kj \log n$.


Induction Step:

We have, by replacing $\hat{\delta}_{\hat{A}(x-1)}(x-1)$ in the equation for $\hat{A}(x)$,

$$\hat{A}(x) = \hat{A}(x-1) + \frac{-2\alpha \log n}{\log\left(1 - \dfrac{2p(n-x+1)}{2pn - \hat{A}(x-1)/(x-1)}\right)}$$

By the induction assumption we may substitute $\hat{A}(x-1) \le (x-1)\,k \log n$ and by using also $p \ge c \log n / n$ we get by elementary manipulations that

$$\hat{A}(x) \le k(x + x') \log n$$


where 


X'  - 


2g/k 


But  x'  <  0  for  k 


log  [2c- 2k]  -  log 

2a  c 


2c (x-1) 
n 


2k 


2a  +  c 

and $x \le n - 1$ as assumed.

Thus $\hat{A}(x) \le kx \log n$.
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)c  +  2a  A 

Thus  for  - ■=•  we  have  A(h)  ^t  ,  and  we  conclude  that  the 

^a  -  i  n-i 

-a0 

E-R  algorithm  outputs  a  Hamiltonian  path  with  probability  >1-  |E| 
where  aQ<a-l/2. 


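Step D: Bounds on the Mean and Variance of $T_h$

We have from Corollary 5.1 that

$$\mathrm{Prob}\{T_{h+1} - T_h = x+1\} < s_h\, g(x, \underline{q}(h)) + |E|^{-\alpha(h+1)}$$

where

$$s_h = \frac{\hat{\varepsilon}_{\hat{A}(h+1)-1}(h)}{\underline{q}(h)}$$

and $\underline{q}(h) = \underline{\varepsilon}_{\underline{A}(h)}(h)$.

This requires calculation of the lower bound $\underline{A}(h)$, which in this application is trivial: $\underline{A}(h) = h$. But $s_h \sim 1/(1 - k_2/k_1)$ is constant for $p = \Theta(\log n / n)$. Also, for $\alpha(h+1) > 0$, $|E|^{-\alpha(h+1)} \to 0$ as $|E| \to \infty$.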

D.a: Upper Bound on the Mean of $T_h$ for $h_0 = n - 1$

From Lemma 6.5 we remark that the upper bound of the mean must be $< kn \log n$. To analytically derive a tighter bound, we have:

$$\hat{\varepsilon}_{\hat{A}(h+1)}(h) = \frac{n - h}{n - \dfrac{\hat{A}(h+1)}{2ph}} \le \frac{n - h}{n - \dfrac{k(h+1)\log n}{2ph}}$$

(by the fact $\hat{A}(x) \le kx \log n$). So, we get

$$\hat{\varepsilon}_{\hat{A}(h+1)}(h) < \frac{n - h}{n - \dfrac{kn}{2c} \cdot \dfrac{h+1}{h}}$$

(using $pn \ge c \log n$).

So,

eS(h+l) 


(h)  < 


n  -  h 


n  -  h  \ 
n  / 


9 


El 

El 

I 

n 


where 


g(h)  -1-5 


Let  us  define  a  constant 


■  .si 

.  3 


Then 


.  ES<h+l) (h)  _  ,, 

sh  ■  — 5(h)  d 


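Then, from Corollary 5.1 and the Appendix we get (by taking means) that

$$\mathrm{mean}(T_{h+1} - T_h) \le s_h\, \frac{1 - q(h)}{q(h)} \le d'\, \frac{h}{n - h}.$$

So

$$\mathrm{mean}(T_{h_0}) = \sum_{h=0}^{n-1} \mathrm{mean}(T_{h+1} - T_h) < d' \int_0^{n-1} \frac{h}{n - h}\, dh < d'\,[\,n \log n - (n-1)\,].$$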
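The size of this bound can be checked directly. The sketch below compares the discrete sum of $h/(n-h)$ over $h = 0, \ldots, n-1$ with the value $n \log n - (n-1)$ of the integral, omitting the constant $d'$; the function names and the choice of n are illustrative. The sum equals $n H_n - n$ exactly, where $H_n$ is the n-th harmonic number, so the two quantities agree to leading order $n \log n$ and differ by a term linear in n.

```python
import math


def sum_of_increment_means(n: int) -> float:
    """Sum over h = 0..n-1 of h / (n - h), i.e. of (1 - q)/q with q = (n-h)/n."""
    return sum(h / (n - h) for h in range(n))


def integral_value(n: int) -> float:
    """Integral of h / (n - h) for h from 0 to n - 1: n log n - (n - 1)."""
    return n * math.log(n) - (n - 1)


def harmonic(n: int) -> float:
    """n-th harmonic number H_n."""
    return sum(1.0 / j for j in range(1, n + 1))


if __name__ == "__main__":
    n = 100_000  # illustrative size
    s = sum_of_increment_means(n)
    print(s, n * harmonic(n) - n, integral_value(n), s / integral_value(n))
```

Because the ratio of the two quantities tends to 1 as n grows, either one yields the same $\Theta(n \log n)$ estimate for the mean.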


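D.b: Lower Bounds on the Mean of $T_{h_0}$

Again we do an asymptotic analysis as $n \to \infty$. We have

$$\underline{\varepsilon}_{\underline{A}(h+1)}(h) = \frac{2p(n-h)}{2pn - \dfrac{\underline{A}(h+1)}{h}} = \frac{1 - h/n}{1 - (h+1)/(2phn)},$$

by using $\underline{A}(h) = h$.

Since $pn > c \log n$,

$$\underline{\varepsilon}_{\underline{A}(h+1)}(h) \sim 1 - \frac{h}{n}$$

as $n \to \infty$.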


Also, 


s(h)  . 


(1  -  h/n) 


(l  -  k/2c) 


1 

at  — 

d 


with  d  =  (1  -  k/2c)  . 

By  Corollary  5.1  and  Appendix 


mean(T  .-I.)  >  s(h) - 3 -  ^  f(h) 

h+1  h  ~ ,,  , 

q  (h) 


where 


q  (h)  =  d 


(•■5) 


and  f (h)  = 


n  -  d  (n-h) 
d2 (n-h) 


n-1  n-1 

i(T  >  **  H  mean (T  —  T.  )  =  *  (h)  , 

ho  h=0  h+1  h  h=0 


mean(T.  )  > 

ho 


f  (h)  dh  -  f  (0)  > 

d^ 


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As  n -v the  obtained  lower  and  upper  bounds  are  tight  within  a 
constant  factor 


O'*2  -  &  (&)' 


if  p  -  e 


f  j 

L  i 


Thus,

COROLLARY. $\mathrm{mean}(T_{h_0}) = \Theta(n \log n)$ for $p = \Theta(\log n / n)$.

D.c:  Upper  Bounds  on  the  Second  Moment  of  T, 


From  the  Appendix 


m 


mean 


(v  >  -  E  PiDi 

i«l 


r.  «>!.+  rJ  8  h 


i  3r.  i  ~  2 
1  8ri j 


where  Y  is  a  sum  of  m  truncated  geometries  of  parameters  p^  and 


i  ■ 


r .  and 

i 


rs+1  -  1 

h(riJ  - 


nQ  the  truncation  point,  and 


s  =  m  n 


0  ' 


n  -1 

D,  »  r.  •  n 
i  i 


h. 


jfi  pi"pj 


In  our  case,  p,  =  q(h)  -  n-h/  So, 

n 


n 

!  A 
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Then 
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_  „  i  .  , . n-i 

Di  ■  B  sz  <'1’ 


We  can  prove  by  an  easy  induction  that  B > exp(d (i-n) ) .  So  we  get 


mean(T2  )  >  £  u{i) 
n-i  ~ 
i*l 


where 


u(i)  «=  exp(d(i-n) )  (-l)""1  A. 

n-i  i 


p.  p. 

*1  *1 


meanCT2^)  >  J"  u(i)  di-u(O) 


A  calculation  of  this  integral  gives  us 


mean (T  , )  >  exp ( 
n-i 


-«[3  a*  *(£-?)*] 


A lower bound on $\mathrm{var}(T_{n-1})$ follows immediately from our bounds on $\mathrm{mean}(T_{n-1}^2)$ and $\mathrm{mean}(T_{n-1})$. Hence,


LEMMA  6.6 


4  e_<*n3  +  ft(n)  <  mean(T2  )  <  —  n3  +  0(n2) 

4  n-1  e 

and $\mathrm{var}(T_{h_0}) = \Theta(n^3)$, if $\exp(d)\,c$ is constant.

This completes the analysis of the Posa-Karp Algorithm.


Step A: Formulation as an RIS

We will follow here the formulation as an RIS given in 2.3 (Examples of RIS). The extension and rotation operations are described in 3.1. Let $G_0$ be an instance of the random graph $B_{n,p}$ and let I be an independent set of size h, obtained after t steps of the E-R algorithm.

Step B: Derivation of the Bounding Parameters

By the definition of the rotation and extension, we note that as soon as an edge e is examined by the algorithm, both its vertices stay in I for subsequent time steps. Hence, $|\mathcal{E}_T(I)|$ follows the same distribution (as in Lemma 6.1) with mean $p(n-h)$. Lemma 1 also holds here (since it depends only on the cardinality of I) and Corollary 6.3 can be proved by similar arguments. For $p \ge c \log n / n$ we get exactly the same values of $x_{\min}$, $x_{\max}$, $y_{\min}$, $y_{\max}$ and the same asymptotic expressions for $\hat{\varepsilon}_t(h)$, $\underline{\varepsilon}_t(h)$.

Steps C and D:

The analysis is the same as in the corresponding steps of the analysis of the Posa-Karp algorithm. So, we get:

If $p \ge c \log n / n$, the algorithm E-R constructs a perfect matching I with $|I| = n$ in the random bipartite graph $B_{n,p}$, in average time $\mathrm{mean}(T_n) = \Theta(n \log n)$, with probability of success $\ge 1 - n^{-2\alpha}$, $\alpha > 1$. The constant c depends on $\alpha$ as in the Posa-Karp case.


Previously, Angluin and Valiant [1979] and Walkup [1977] have described algorithms for detecting perfect matchings in a random graph $G_{2n,p}$ with $p > c(\log n)/n$. We now briefly sketch an analysis of the performance of the extension-rotation algorithm for perfect matching.


Step  A:  Formulation  as  an  RIS 

We  will  follow  the  formulation  given  in  2.3  and  use  the  extension 
and  rotation  as  in  3.1. 

Step B: Derivation of $\underline{\varepsilon}_t(h)$, $\hat{\varepsilon}_t(h)$

Let

$$a(h) = (n-h)(2n-2h-1)$$

$$a'(h) = 4ph(n-h)$$

$$f_t(h) = t\,(n-h-1/2)(n-h)/n^2$$

$$f'_t(h) = ht\,(n-h)/n^2.$$

Again, we may use symmetry arguments and Lemma 5.1 to bound the cardinalities of $\mathcal{E}_t(I)$, $\mathcal{R}_t(I)$ and $|E_t|$ for a class of executions $\mathcal{A}_0$ with probability $> 1 - |E|^{-\alpha}$. Let $h = |I|$.

For executions in $\mathcal{A}_0$,

$$(1-\beta)\,a(h) < |\mathcal{E}_t(I)| + f_t(h) < (1+\beta)\,a(h)$$

and

$$(1-\beta)\,a'(h) < |\mathcal{R}_t(I)| + f'_t(h) < (1+\beta)\,a'(h).$$
and 
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th  -  (l-e}(a(h)  +*'(h))  -ft(h)  -f£(h)  . 

Then $|\mathcal{E}_t(I)| + |\mathcal{R}_t(I)| > 0$ for $t < t_h$ in executions of $\mathcal{A}_0$, verifying condition C1.

We may let

$$\underline{\varepsilon}_t(h) = \frac{(1-\beta)\,a(h) - f_t(h)}{(1+\beta)(a(h) + a'(h)) - f_t(h) - f'_t(h)}$$


et(h) 

so  we  have 

£t(h) 


for  executions  in  ,*/0. 

By taking partial derivatives of $\varepsilon_t(h)$ with respect to t and h, we can again show the monotonicity condition R1 is satisfied. It is also obvious that monotonicity condition R2 holds.

As $n \to \infty$, the asymptotic bounds on the conditional extension probability are again tight: $\underline{\varepsilon}_t(h) \sim \hat{\varepsilon}_t(h)$. By the routine calculations described in Section 5, the reader may verify that $\hat{A}(n) \le t_n$, so the E-R algorithm outputs a perfect matching with probability $> 1 - |E|^{-\alpha}$.

We also leave the reader to calculate tight bounds on the mean and variance of $T_n$:

$$\mathrm{mean}(T_n) = \Theta(n \log n) \quad \text{and} \quad \mathrm{mean}(T_n^2) = \Theta(n^3)$$

by applying Corollary 5.1 (which bounds the probability density of $T_{h+1} - T_h$ by geometric density functions) and using the formulas of the Appendix to calculate the moments, as we did in the Hamiltonian path applications.

Angluin and Valiant [1979] show that each "unit time" step of Algorithm E-R for this application requires $\Theta(\log n)$ instructions on a RAM machine. Thus, the above mean and variance bounds must be multiplied by a constant multiple of $\log n$ and $(\log n)^2$, respectively.
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APPENDIX 


We consider a random variable Y which is a sum

$$Y = X_1 + \cdots + X_m$$

of geometrically distributed variables $X_1, \ldots, X_m$. This Appendix provides formulas for the mean, variance and some low order moments of Y.

For each $i = 1, \ldots, m$ we assume $X_i$ has truncated geometric density with parameter $p_i \in [0,1]$. Let $r_i = 1 - p_i$ and

$$g_i(k) = p_i r_i^k \quad \text{for } k = 0, \ldots, n_0, \qquad g_i(k) = 0 \quad \text{else.}$$

The density function of $X_1 + X_2$ is, for $0 \le k \le 2n_0$,

$$g_1 * g_2(k) = \sum_{j=0}^{k} g_1(j)\, g_2(k-j) = \frac{p_1 p_2}{p_2 - p_1}\left( r_1^{k+1} - r_2^{k+1} \right).$$
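The two-variable formula can be verified against the defining sum. The sketch below is an illustration with assumed function names and parameter values; it ignores truncation, so it applies to $k \le n_0$ where both densities are still given by $p_i r_i^k$, and it requires $p_1 \ne p_2$.

```python
def convolution_direct(p1: float, p2: float, k: int) -> float:
    """Sum_{j=0}^{k} g1(j) * g2(k - j) with g_i(k) = p_i * r_i**k."""
    r1, r2 = 1.0 - p1, 1.0 - p2
    return sum(p1 * r1 ** j * p2 * r2 ** (k - j) for j in range(k + 1))


def convolution_closed_form(p1: float, p2: float, k: int) -> float:
    """p1*p2/(p2 - p1) * (r1**(k+1) - r2**(k+1)), valid for p1 != p2."""
    r1, r2 = 1.0 - p1, 1.0 - p2
    return p1 * p2 / (p2 - p1) * (r1 ** (k + 1) - r2 ** (k + 1))


if __name__ == "__main__":
    p1, p2 = 0.3, 0.6  # illustrative distinct parameters
    for k in range(5):
        print(k, convolution_direct(p1, p2, k), convolution_closed_form(p1, p2, k))
```

The requirement that the parameters be distinct is what the text means by variables with distinct geometric distributions: the closed form divides by $p_2 - p_1$.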

By applying induction, we derive the density function of

$$Y = \sum_{i=1}^{m} X_i$$

f  (k) 


